Abstract

Robots  are  added  to  human  teams  to  increase  the 
team’s  skills  or  capabilities  but  in  order  to  get 
the  full  benefit  the  teams  must  trust  the  robots. 

We present an approach that allows a robot to estimate its
trustworthiness and adapt its behavior accordingly. Additionally,
the robot uses case-based reasoning to store previous behavior
adaptations and uses this information to perform future
adaptations. In a simulated robotics domain, we compare case-based
behavior adaptation to behavior adaptation that does not learn and
show that it significantly reduces the number of behaviors that
need to be evaluated before a trustworthy behavior is found.

1  Introduction 

Robots can be important members of human teams if they provide
capabilities that humans do not have. These could include improved
sensory capabilities, communication capabilities, or an ability to
operate in environments humans cannot (e.g., rough terrain or
dangerous situations). Adding these robots might be necessary for
the team to meet its objectives and reduce human risk. However, to
make full use of the robots the teammates will need to trust them.

This is especially important for robots that operate autonomously
or semi-autonomously. In these situations, the human teammates
would likely issue commands or delegate tasks to the robot to
reduce their workload or more efficiently achieve team goals. A
lack of trust in the robot could result in the humans
underutilizing it, unnecessarily monitoring its actions, or
possibly not using it at all.

A robot could be designed so that it operates in a sufficiently
trustworthy manner. However, this may be impractical because the
measure of trust might be task-dependent, user-dependent, or change
over time (Desai et al. 2013). For example, if a robot receives a
command from an operator to navigate between two locations in a
city, one operator might prefer the task be performed as quickly as
possible whereas another might prefer the task be performed as
safely as possible (e.g., avoiding bumping into any obstacles).
Each operator has distinct preferences that influence how they will
trust the robot’s behavior, and these preferences may conflict.
Even if these user preferences were known in advance, a change in
context could also influence what behaviors are trustworthy. An
operator who generally prefers a task to be performed quickly would
likely change that preference if the robot was transporting
hazardous material, whereas an operator who prefers safety would
likely change their preferences in an emergency situation.

The ability of a robot to behave in a trustworthy manner regardless
of the operator, task, or context requires that it can evaluate its
trustworthiness and adapt its behavior accordingly. The robot may
not get explicit feedback about its trustworthiness but will
instead need to estimate its trustworthiness based on its
interactions with its operator. Such an estimate, which we refer to
as an inverse trust estimate, differs from traditional
computational trust metrics in that it measures how much trust
another agent has in the robot rather than how much trust the robot
has in another agent. In this paper we examine how a robot can
estimate the trust an operator has in it, adapt its behavior to
become more trustworthy, and learn from previous adaptations so it
can perform trustworthy behaviors more quickly.

In the remainder of this paper we describe our behavior adaptation
approach and evaluate it in a simulated robotics domain. Section 2
presents the inverse trust metric and Section 3 describes how it
can be used to guide the robot’s behavior. In Section 4, we
evaluate our case-based behavior adaptation strategy in a simulated
robotics domain and report evidence that it can efficiently adapt
the robot’s behavior to the operator’s preferences. Related work is
examined in Section 5, followed by a discussion of future work and
concluding remarks in Section 6.

2  Inverse  Trust  Estimation

Traditional  trust  metrics  are  used  to  estimate  the  trust 
an  agent  should  have  in  other  agents  (Sabater  and 
Sierra  2005).  The  agent  can  use  past  interactions  with 
those  agents  or  feedback  from  others  to  determine  their 
trustworthiness.  The  information  this  agent  uses  is 
likely  internal  to  it  and  not  directly  observable  by  a 
third  party.  In  a  robotics  context,  the  robot  will  not  be 
able  to  observe  the  information  a  human  operator  uses 
to  assess  their  trust  in  it.  Instead,  the  robot  will  need 
to  obtain  this  internal  information  to  estimate  operator 
trust. 

One option would be to directly ask the operator, either as it is
interacting with the robot (Kaniarasu et al. 2013) or after the
task has been completed (Jian, Bisantz, and Drury 2000; Muir 1987),
about how trustworthy the robot was behaving. However, this might
not be practical in situations that are time-sensitive or where
there would be a significant delay between when the robot wishes to
evaluate its trustworthiness and the next opportunity to ask the
operator (e.g., during a multi-day search and rescue mission). An
alternative that does not require direct operator feedback is for
the robot to infer the trust the operator has in it.

Factors that influence human-robot trust can be grouped into three
main categories (Oleson et al. 2011): robot-related factors (e.g.,
performance, physical attributes), human-related factors (e.g.,
engagement, workload, self-confidence), and environmental factors
(e.g., group composition, culture, task type). Although these
factors have all been shown to influence human-robot trust, the
strongest indicator of trust is robot performance (Hancock et al.
2011; Carlson et al. 2014). Kaniarasu et al. (2012) have used an
inverse trust metric that estimates robot performance based on the
number of times the operator warns the robot about its behavior and
the number of times the operator takes manual control of the robot.
They found this metric aligns closely with the results of trust
surveys performed by the operators. However, this metric does not
take into account factors of the robot’s behavior that increase
trust.

The inverse trust metric we use is based on the number of times the
robot completes an assigned task, fails to complete a task, or is
interrupted while performing a task. An interruption occurs when
the operator tells the robot to stop its current autonomous
behavior. Our robot infers that any interruptions are a result of
the operator being unsatisfied with the robot’s performance.
Similarly, our robot assumes the operator will be unsatisfied with
any failures and satisfied with any completed tasks. Interruptions
could also be a result of a change in the operator’s goals, or
failures could be a result of unachievable tasks, but the robot
works under the assumption that those situations occur rarely.

Our control strategy estimates whether trust is increasing,
decreasing, or remaining constant over periods of time related to
how long the robot has been performing its current behavior. For
example, if the robot modifies its behavior at time $t_A$ in an
attempt to perform more trustworthy behavior, the trust value will
be estimated using information from $t_A$ onward. We evaluate the
trust value between times $t_A$ and $t_B$ as follows:

$$Trust_{A-B} = \sum_{i=1}^{n} w_i \times cmd_i ,$$

where there were $n$ commands issued to the robot between $t_A$ and
$t_B$. If the $i$th command ($1 \le i \le n$) was interrupted or
failed it will decrease the trust value, and if it was completed
successfully it will increase the trust value
($cmd_i \in \{-1, 1\}$). The $i$th command will also receive a
weight ($w_i \in [0, 1]$) related to the robot’s behavior (e.g., a
command that was interrupted because the robot performed a behavior
slowly would likely be weighted less than an interruption because
the robot injured a human).
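
As a minimal sketch of the weighted sum above, the following Python snippet computes the trust value for a short command history. The `CommandOutcome` type, the function name, and the example weights are hypothetical and chosen only for illustration; how the weights would be assigned in practice is not specified here.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class CommandOutcome:
    """One command issued between t_A and t_B (hypothetical container)."""
    completed: bool  # True if completed; False if interrupted or failed
    weight: float    # w_i, expected to lie in [0, 1]


def inverse_trust(outcomes: Iterable[CommandOutcome]) -> float:
    """Return the sum of w_i * cmd_i, with cmd_i = +1 or -1."""
    total = 0.0
    for outcome in outcomes:
        if not 0.0 <= outcome.weight <= 1.0:
            raise ValueError("weight must lie in [0, 1]")
        cmd = 1 if outcome.completed else -1
        total += outcome.weight * cmd
    return total


# Illustrative history: one success, one lightly weighted interruption,
# and one fully weighted failure (weights are made up).
history = [
    CommandOutcome(completed=True, weight=1.0),
    CommandOutcome(completed=False, weight=0.5),
    CommandOutcome(completed=False, weight=1.0),
]
print(inverse_trust(history))  # -0.5
```

With equal weights the estimate reduces to the number of successes minus the number of interruptions and failures.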

3  Trust-Guided  Behavior  Adaptation 
Using  Case-Based  Reasoning 

The robot uses the inverse trust estimate to infer if its current
behavior is trustworthy, is not trustworthy, or it does not yet
know. Two threshold values are used to identify trustworthy and
untrustworthy behavior: the trustworthy threshold ($\tau_T$) and
the untrustworthy threshold ($\tau_{UT}$). Our robot uses the
following tests:

•  If the trust value reaches the trustworthy threshold
($Trust_{A-B} \ge \tau_T$), the robot will conclude it has found a
sufficiently trustworthy behavior.

•  If the trust value falls to or below the untrustworthy threshold
($Trust_{A-B} \le \tau_{UT}$), the robot will modify its behavior
in an attempt to use a more trustworthy behavior.

•  If the trust value is between the two thresholds
($\tau_{UT} < Trust_{A-B} < \tau_T$), the robot will continue to
evaluate the operator’s trust.
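
The three tests above can be sketched as a small classification function. This is only an illustration: the function name and return labels are hypothetical, the default thresholds are the values used later in the evaluation (5.0 and -5.0), and treating the boundary cases as inclusive is an assumption made so that the three cases do not overlap.

```python
def classify_trust(trust: float,
                   trustworthy_threshold: float = 5.0,
                   untrustworthy_threshold: float = -5.0) -> str:
    """Map a trust value to one of three (hypothetical) labels."""
    if trust >= trustworthy_threshold:
        return "trustworthy"      # keep the current behavior
    if trust <= untrustworthy_threshold:
        return "untrustworthy"    # modify the behavior
    return "undetermined"         # keep evaluating the current behavior


for value in (5.0, 0.5, -5.0):
    print(value, classify_trust(value))
```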

In the situations where the trustworthy threshold has been reached
or neither threshold has been reached, the robot will continue to
use its current behavior. However, when the untrustworthy threshold
has been reached the robot will modify its behavior in an attempt
to behave in a more trustworthy manner. The ability of the robot to
modify its own behavior is guided by the number of behavioral
components that it can modify. These modifiable components could
include changing an algorithm used (e.g., switching between two
path planning algorithms), changing parameter values it uses, or
changing data that is being used (e.g., using a different map of
the environment). Each modifiable component $i$ will have a set
$C_i$ of possible values from which the component’s value can be
selected.

If the robot has $m$ components of its behavior that can be
modified, its current behavior $B$ will be a tuple containing the
currently selected value $c_i$ for each modifiable component
($c_i \in C_i$):

$$B = (c_1, c_2, \ldots, c_m)$$

When a behavior $B$ is found by the robot to be untrustworthy, it
is stored as an evaluated pair $E$ that also contains the time $t$
it took the behavior to be labeled as untrustworthy:

$$E = (B, t)$$

The time it took for a behavior to reach the untrustworthy
threshold is used to compare behaviors that have been found to be
untrustworthy. A behavior $B'$ that reaches the untrustworthy
threshold sooner than another behavior $B''$ ($t' < t''$) is
assumed to be less trustworthy than the other. This is based on the
assumption that if a behavior took longer to reach the
untrustworthy threshold then it was likely performing some
trustworthy behaviors or was not performing untrustworthy behaviors
as quickly.
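
For illustration, the behavior tuple and evaluated pair described above could be represented as follows. The class and function names are hypothetical, and the example behaviors and times are invented; the only rule taken from the text is that the behavior labeled untrustworthy sooner is assumed to be the less trustworthy one.

```python
from typing import NamedTuple, Tuple


class EvaluatedPair(NamedTuple):
    """E = (B, t): an untrustworthy behavior and its time-to-label."""
    behavior: Tuple[float, ...]  # B = (c_1, ..., c_m), one value per component
    time: float                  # t, in whatever unit the robot records


def less_trustworthy(first: EvaluatedPair, second: EvaluatedPair) -> EvaluatedPair:
    """Return the pair assumed less trustworthy: the one labeled sooner."""
    return first if first.time < second.time else second


# Hypothetical two-component behaviors; all values are made up.
quick_label = EvaluatedPair(behavior=(8.0, 0.1), time=12.0)
slow_label = EvaluatedPair(behavior=(1.0, 1.5), time=40.0)
print(less_trustworthy(quick_label, slow_label).behavior)  # (8.0, 0.1)
```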

As the robot evaluates behaviors, it stores a set
$\mathcal{E}_{past}$ of previously evaluated behaviors
($\mathcal{E}_{past} = \{E_1, E_2, \ldots, E_n\}$). It continues to
add to this set until it locates a trustworthy behavior $B_{final}$
(when the trustworthy threshold is reached). The set of evaluated
behaviors can be thought of as the search path that resulted in the
final solution (the trustworthy behavior). The search path
information is potentially useful because if the robot can
determine it is on a search path similar to one it has previously
encountered (similar behaviors being labeled untrustworthy in a
similar amount of time) then the robot can identify what final
behavior it should attempt. To allow for the reuse of past behavior
adaptation information we use case-based reasoning (Richter and
Weber 2013).

Each case $C$ is composed of a problem and a solution. In our
context, the problem is the set of previously evaluated behaviors
and the solution is the final trustworthy behavior:

$$C = (\mathcal{E}_{past}, B_{final})$$

These cases are stored in a case base and represent the robot’s
knowledge about previous behavior adaptation.

When the robot modifies its behavior it selects new values for one
or more of the modifiable components. The new behavior $B_{new}$ is
selected as a function of all behaviors that have been previously
evaluated for this operator and its case base $CB$:

$$B_{new} = selectBehavior(\mathcal{E}_{past}, CB)$$

The selectBehavior function (Algorithm 1) attempts to use previous
adaptation experience to guide the current adaptation. The
algorithm iterates through each case in the case base and checks to
see if that case’s final behavior has already been evaluated by the
robot. If the behavior has been evaluated, that means the robot has
already found the behavior to be untrustworthy so the robot does
not try to use it again. The remaining cases have their set of
evaluated behaviors ($C_i.\mathcal{E}_{past}$) compared to the
robot’s current set of evaluated behaviors ($\mathcal{E}_{past}$).
The most similar case’s final behavior is returned and will be used
by the robot. If no such behaviors are found (the final behaviors
of all cases have been examined or the case base is empty), the
modifyBehavior function is used to select the next behavior to
perform. It selects an evaluated behavior $E_{max}$ that took the
longest to reach the untrustworthy threshold
($\forall E_i \in \mathcal{E}_{past} (E_{max}.t \ge E_i.t)$) and
performs a random walk (without repetition) to find a behavior
$B_{new}$ that requires the minimum number of changes from
$E_{max}.B$ and has not already been evaluated
($\forall E_i \in \mathcal{E}_{past} (B_{new} \ne E_i.B)$). If all
possible behaviors have been evaluated and found to be
untrustworthy the robot will stop adapting its behavior and use the
behavior from $E_{max}$.
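
One possible reading of the modifyBehavior step is sketched below. Several details are assumptions made for the sake of a runnable example: behaviors are tuples, a "change" is counted as one component whose value differs from the base behavior, the whole behavior space is enumerated up front, and ties are broken uniformly at random. None of these choices is stated in the text, and the function and parameter names are hypothetical.

```python
import itertools
import random


def modify_behavior(evaluated, component_values, rng=random):
    """Pick an unevaluated behavior close to the longest-surviving one.

    evaluated: non-empty list of (behavior_tuple, time_to_untrustworthy).
    component_values: one list of allowed values per modifiable component.
    """
    # E_max: the evaluated behavior that took longest to be labeled untrustworthy.
    base, _ = max(evaluated, key=lambda pair: pair[1])
    tried = {behavior for behavior, _ in evaluated}

    # All behaviors not yet evaluated (feasible only for small spaces).
    candidates = [b for b in itertools.product(*component_values)
                  if b not in tried]
    if not candidates:
        return base  # everything was untrustworthy: fall back to E_max.B

    def changes(behavior):
        # Assumed distance: number of components that differ from the base.
        return sum(x != y for x, y in zip(behavior, base))

    fewest = min(changes(b) for b in candidates)
    return rng.choice([b for b in candidates if changes(b) == fewest])


# Tiny made-up space: two components with two values each.
space = [[1.0, 2.0], [0.1, 0.2]]
history = [((1.0, 0.1), 3.0), ((2.0, 0.1), 7.0)]
print(modify_behavior(history, space))  # (2.0, 0.2), the only one-change option
```

A different distance (for example, the number of increments between values) would produce a more local search; the text does not say which was intended.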


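Algorithm 1: Selecting a New Behavior

Function: selectBehavior($\mathcal{E}_{past}$, $CB$) returns $B_{new}$;

    $bestSim \leftarrow 0$; $B_{best} \leftarrow \emptyset$;
    foreach $C_i \in CB$ do
        if $C_i.B_{final} \notin \mathcal{E}_{past}$ then
            $sim_i \leftarrow sim(\mathcal{E}_{past}, C_i.\mathcal{E}_{past})$;
            if $sim_i > bestSim$ then
                $bestSim \leftarrow sim_i$;
                $B_{best} \leftarrow C_i.B_{final}$;
    if $B_{best} = \emptyset$ then
        $B_{best} \leftarrow modifyBehavior(\mathcal{E}_{past})$;
    return $B_{best}$;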
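
The control flow of Algorithm 1 can be sketched in Python as follows. The similarity function and the fallback are passed in as parameters so the sketch stays independent of how they are implemented; the data layout (cases as `(evaluated_pairs, final_behavior)` tuples), the function names, and the stub similarity in the demonstration are all hypothetical.

```python
def select_behavior(evaluated, case_base, sim, fallback):
    """Return the final behavior of the most similar unused case.

    evaluated: list of (behavior, time) pairs for the current operator.
    case_base: list of (evaluated_pairs, final_behavior) cases.
    sim: function scoring two lists of evaluated pairs.
    fallback: function called with `evaluated` when no case applies.
    """
    tried = {behavior for behavior, _ in evaluated}
    best_sim, best_behavior = 0.0, None
    for case_evaluated, final_behavior in case_base:
        if final_behavior in tried:
            continue  # already found untrustworthy for this operator
        score = sim(evaluated, case_evaluated)
        if score > best_sim:  # strict, as in Algorithm 1
            best_sim, best_behavior = score, final_behavior
    if best_behavior is None:
        best_behavior = fallback(evaluated)
    return best_behavior


# Demonstration with a stub similarity (values are invented).
cases = [
    ([((1.0, 0.1), 4.0)], (5.0, 0.5)),
    ([((2.0, 0.2), 9.0)], (3.0, 1.0)),
]
stub_sim = lambda current, stored: 0.9 if stored[0][1] == 9.0 else 0.2
current = [((2.0, 0.2), 8.0)]
print(select_behavior(current, cases, stub_sim, lambda e: None))  # (3.0, 1.0)
```

Because the comparison is strict and the best similarity starts at zero, a case whose similarity is exactly zero is never selected, and the fallback is used instead.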


The similarity between two sets of evaluated behaviors (Algorithm
2) is complicated by the fact that the sets may vary in size. The
size of each set depends on the number of previous behaviors that
were evaluated by the robot, and there is no guarantee that the
sets contain identical behaviors. To account for this, the
similarity function looks at the overlap between the two sets and
ignores behaviors that have been examined in only one of the sets.
Each evaluated behavior in the first set has its behavior matched
to an evaluated behavior $E_{max}$ in the second set that contains
the most similar behavior
($sim(B_A, B_B) = \frac{1}{m} \sum_{i=1}^{m} sim(B_A.c_i, B_B.c_i)$,
where the similarity function will depend on the specific type of
behavior component). If those behaviors are similar enough, based
on a threshold $\lambda$, then the similarity of the time
components of these evaluated behaviors is included in the
similarity calculation. This ensures that only matches between
evaluated behaviors that are highly similar (i.e., similar
behaviors exist in both sets) are included in the similarity
calculation.
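
A rough Python sketch of the set similarity described above follows. The component similarity and the time similarity are not specified in the text, so both are assumptions here: numeric components are compared as one minus the absolute difference divided by the component's range, and times as one minus the relative difference. The function names, the `ranges` parameter, and the example values are hypothetical; the default threshold of 0.95 is the value reported in the evaluation.

```python
def behavior_sim(a, b, ranges):
    """Mean per-component similarity (assumed form for numeric components)."""
    return sum(1.0 - abs(x - y) / r for x, y, r in zip(a, b, ranges)) / len(a)


def time_sim(t_a, t_b):
    """Assumed time similarity: 1 minus the relative difference."""
    longest = max(t_a, t_b)
    return 1.0 if longest == 0 else 1.0 - abs(t_a - t_b) / longest


def set_sim(set_a, set_b, ranges, threshold=0.95):
    """Average time similarity over sufficiently similar matched behaviors."""
    if not set_b:
        return 0.0
    total, num = 0.0, 0
    for behavior_a, t_a in set_a:
        # Best match in the second set by behavior similarity.
        behavior_b, t_b = max(
            set_b, key=lambda pair: behavior_sim(behavior_a, pair[0], ranges))
        if behavior_sim(behavior_a, behavior_b, ranges) > threshold:
            total += time_sim(t_a, t_b)
            num += 1
    return total / num if num else 0.0


# Ranges assume speed in [0.5, 10.0] and padding in [0.1, 2.0].
ranges = (9.5, 1.9)
current = [((1.0, 0.5), 10.0)]
stored = [((1.0, 0.5), 8.0), ((9.0, 1.5), 3.0)]
print(set_sim(current, stored, ranges))
```

In this example only the identical behavior passes the threshold, so the result is the time similarity of that single match.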

Algorithm 2: Similarity between sets of evaluated behaviors

Function: sim($\mathcal{E}_A$, $\mathcal{E}_B$) returns $sim$;

    $totalSim \leftarrow 0$; $num \leftarrow 0$;
    foreach $E_i \in \mathcal{E}_A$ do
        $E_{max} \leftarrow \arg\max_{E_j \in \mathcal{E}_B} (sim(E_i.B, E_j.B))$;
        if $sim(E_i.B, E_{max}.B) > \lambda$ then
            $totalSim \leftarrow totalSim + sim(E_i.t, E_{max}.t)$;
            $num \leftarrow num + 1$;
    if $num = 0$ then
        return 0;
    return $totalSim / num$;

4  Evaluation

In this section, we describe an evaluation for our claim that the
case-based reasoning approach is able to adapt to and perform
trustworthy behaviors more quickly than a random walk approach. We
conducted this study in a simulated environment with a simulated
robot and operator.

4.1  eBotworks Simulator

Our evaluation uses the eBotworks simulation environment (Knexus
Research Corporation 2013). eBotworks is a multi-agent simulation
engine and testbed that allows for multimodal command and control
of unmanned systems. It allows for autonomous agents to control
simulated robotic vehicles while interacting with human operators,
and for the autonomous behavior to be observed and evaluated.

We use a simulated urban environment (Figure 1) containing
landmarks (e.g., roads) and objects (e.g., houses, humans, traffic
cones, vehicles, road barriers). The robot is a wheeled unmanned
ground vehicle (UGV) and uses eBotworks’ built-in natural language
processing (for interpreting user commands), locomotion, and
path-planning modules. The actions performed by a robot in
eBotworks are non-deterministic (e.g., the robot cannot anticipate
its exact position after moving).


Figure  1:  Simulated  urban  environment  in  eBotworks 


4.2  Simulated  Operator 

In this study we use a simulated operator to issue commands to the
robot. The simulated operator assesses its trust in the robot using
three factors of the robot’s performance:

•  Task duration: The simulated operator has an expectation about
the amount of time that the task will take to complete
($t_{complete}$). If the robot does not complete the task within
that time, the operator may, with probability $p_\alpha$, interrupt
the robot and issue another command.

•  Task completion: If the operator determines that the robot has
failed to complete the task (e.g., the UGV is stuck), it will
interrupt.

•  Safety: The operator may interrupt the robot, with probability
$p_\gamma$, if the robot collides with any obstacles along the
route.
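
The three factors above can be sketched as a single decision function. This is an illustrative simplification rather than the simulator's logic: it assumes each triggering event (overrunning the expected time, colliding with an obstacle) leads to one independent probabilistic check, and the function and parameter names are hypothetical.

```python
import random


def operator_interrupts(overran_time: bool, stuck: bool, collided: bool,
                        p_alpha: float, p_gamma: float,
                        rng: random.Random) -> bool:
    """Decide whether the simulated operator interrupts the robot."""
    if stuck:
        return True  # task failure always leads to an interruption
    if overran_time and rng.random() < p_alpha:
        return True  # too slow, interrupted with probability p_alpha
    if collided and rng.random() < p_gamma:
        return True  # collision, interrupted with probability p_gamma
    return False


# Example: an operator that nearly always reacts to slowness (p_alpha = 0.95)
# and rarely to collisions (p_gamma = 0.05); the outcome is random.
rng = random.Random(42)
print(operator_interrupts(True, False, True, 0.95, 0.05, rng))
```

Passing a seeded `random.Random` instance keeps simulated trials reproducible.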

4.3  Movement  Scenario 

The simple task the robot is required to perform involves moving
between two locations in the environment. At the start of each run,
the robot will be placed in the environment and the simulated
operator will issue a command for the robot to move to a goal
location. Based on the robot’s performance (task duration, task
completion, and safety), the operator will allow the robot to
complete the task or interrupt it. When the robot completes a task,
fails to complete it, or is interrupted, the scenario will be reset
by placing the robot back at the start location and the operator
will issue another command.

We use three simulated operators:

•  Speed-focused operator: This operator prefers the robot to move
to the destination quickly regardless of whether it hits any
obstacles ($t_{complete} = 15$ seconds, $p_\alpha = 95\%$,
$p_\gamma = 5\%$).

•  Safety-focused operator: This operator prefers the robot to
avoid obstacles regardless of how long it takes to reach the
destination ($t_{complete} = 15$ seconds, $p_\alpha = 5\%$,
$p_\gamma = 95\%$).

•  Balanced operator: This operator prefers a balanced mixture of
speed and safety ($t_{complete} = 15$ seconds, $p_\alpha = 95\%$,
$p_\gamma = 95\%$).

Each of the three simulated operators will control the robot for
500 experimental trials, with each trial terminating when the robot
determines it has found a trustworthy behavior. For the case-based
approach, a case is added to the case base at the end of any trial
where the robot performs at least one random walk adaptation of its
behavior. When no random walk adaptations are performed, the robot
was able to find a trustworthy behavior using the cases in its case
base so there is no need to add another case.

The robot has two modifiable behavior components: speed (meters per
second) and obstacle padding (meters). Speed relates to how fast
the robot can move and obstacle padding relates to the distance the
robot will attempt to keep from obstacles during movement. The set
of possible values for each modifiable component ($C_{speed}$ and
$C_{padding}$) is determined from minimum and maximum values with
fixed increments:

$$C_{speed} = \{0.5, 1.0, \ldots, 10.0\}$$

$$C_{padding} = \{0.1, 0.2, 0.3, \ldots, 2.0\}$$
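
As a quick worked check of the size of this search space, the two value sets can be enumerated directly. The snippet assumes the increments implied by the listed values (0.5 for speed, 0.1 for padding); the variable names are illustrative.

```python
# Enumerate the component value sets from their minimum, maximum, and increment.
speeds = [round(0.5 * k, 1) for k in range(1, 21)]    # 0.5, 1.0, ..., 10.0
paddings = [round(0.1 * k, 1) for k in range(1, 21)]  # 0.1, 0.2, ..., 2.0

# Every behavior is one (speed, padding) pair.
behaviors = [(s, p) for s in speeds for p in paddings]

print(len(speeds), len(paddings), len(behaviors))  # 20 20 400
```

Under that assumption there are 400 distinct behaviors, which bounds the number of behaviors a search without repetition could evaluate in a single trial.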

We test our robot using a trustworthy threshold of $\tau_T = 5.0$
and an untrustworthy threshold of $\tau_{UT} = -5.0$. When
calculating the similarity between sets of evaluated behaviors the
robot uses a similarity threshold of $\lambda = 0.95$ (behaviors
must be 95% similar to be matched).

4.4  Results 

We found that both the case-based behavior adaptation and the
random walk behavior adaptation strategies resulted in similar
trustworthy behaviors for each simulated operator. For the
speed-focused operator, the trustworthy behaviors had higher speeds
regardless of padding ($3.5 \le speed \le 10.0$,
$0.1 \le padding \le 1.9$). The safety-focused operator had higher
padding regardless of speed ($0.5 \le speed \le 10.0$,
$0.4 \le padding \le 1.9$). Finally, the balanced operator had
higher speed and higher padding ($3.5 \le speed \le 10.0$,
$0.4 \le padding \le 1.9$). In addition to having similar value
ranges, there were no statistically significant differences between
the distributions of those values for the two strategies.

The difference between the two behavior adaptation approaches was
related to the number of behaviors that needed to be evaluated
before a trustworthy behavior was found. Table 1 shows the mean
number of evaluated behaviors (and 95% confidence interval) when
interacting with each operator type (over 500 trials for each
operator). In addition to being controlled by only a single
operator, we also examined a condition in which the operator is
selected at random with equal probability. This represents a more
realistic scenario where the robot will be required to interact
with a variety of operators without any knowledge about which
operator will control it.


Table 1: The mean number of behaviors evaluated

Operator          Random Walk    Case-based
Speed-focused     20.3 (±3.4)    1.6 (±0.2)
Safety-focused     2.8 (±0.3)    1.3 (±0.1)
Balanced          27.0 (±3.8)    1.8 (±0.2)
Random            14.6 (±2.9)    1.6 (±0.1)

The case-based approach required significantly fewer behaviors to
be evaluated in all four experiments (using a paired t-test with
$p < 0.01$). This is because the case-based approach was able to
learn from previous adaptations and use that information to quickly
find trustworthy behaviors. At the beginning, when the robot’s case
base is empty, the case-based approach is required to perform
adaptation that is similar to the random walk approach. As the case
base size grows, the number of times random walk adaptation is
required decreases until the agent generally only performs a single
case-based behavior adaptation before finding a trustworthy
behavior. Even when the case base contains cases from all three
simulated operators, the case-based approach can quickly
differentiate between the users and select a trustworthy behavior.
The number of adaptations required for the safety-focused operator
was lower than for the other operators because a higher percentage
of behaviors are considered trustworthy. The robot, which started
the experiments for each operator with an empty case base,
collected 24 cases when interacting with the speed-focused
operator, 18 cases when interacting with the safety-focused
operator, 33 cases when interacting with the balanced operator, and
33 cases when interacting with a random operator.

The primary limitation of the case-based approach is that it relies
on the random walk search when it does not have any suitable cases
to use. Although the mean number of behaviors evaluated by the
case-based approach is low, the situations where random walk is
used (and a new case is created) require an above-average number of
behaviors to be evaluated (closer to the mean number of behaviors
evaluated when only random walk is used). The case-based approach
uses random walk infrequently, so there is not a large impact on
the mean number of behaviors evaluated over 500 trials, but this
would be an important concern as the problem scales to use more
complex behaviors with more modifiable components. Two primary
solutions exist to improve performance in more complex domains:
improved search and seeding of the case base. Random walk search
was used because it requires no explicit knowledge about the domain
or the task. However, a more intelligent search that could identify
relations between interruptions and modifiable components (e.g., an
interruption when the robot is very close to objects requires a
change to the padding value) would likely improve adaptation time.
Since a higher number of behaviors need to be evaluated when new
cases are created, if a set of initial cases were provided to the
robot it would be able to decrease the number of random walk
adaptations (or adaptations requiring a different search technique)
it would need to perform.

5  Related  Work 

In addition to Kaniarasu et al. (2012), Saleh et al. (2012) have
also proposed a measure of inverse trust and use a set of
expert-authored rules to measure trust. Unlike our own work, these
approaches measure trust but do not use this information to adapt
behavior. Shapiro and Shachter (2002) discuss the need for an agent
to act in the best interests of a user even if that requires
sub-optimal performance. Their work examines identifying factors
that influence the user’s utility function and updating the agent’s
reward function accordingly. This is similar to our own work in
that behavior is modified to align with a user’s preference, but
our robot is not given an explicit model of the user’s reasoning
process.

Conversational recommender systems (McGinty and Smyth 2003)
iteratively improve recommendations to a user by tailoring the
recommendations to the user’s preferences. As more information is
obtained through dialogs with a user, these systems refine their
model of that user. Similarly, learning interface agents observe a
user performing a task (e.g., sorting e-mail (Maes and Kozierok
1993) or schedule management (Horvitz 1999)) and learn the user’s
preferences. Both conversational recommender systems and learning
interface agents are designed to learn preferences for a single
task whereas our behavior adaptation requires no prior knowledge
about what tasks will be performed.

Our work also has similarities to other areas of learning during
human-robot interactions. When a robot learns from a human, it is
often beneficial for the robot to understand the environment from
the perspective of the human. Breazeal et al. (2009) have examined
how a robot can learn from a cooperative human teacher by mapping
its sensory inputs to how it estimates the human is viewing the
environment. This allows the robot to learn from the viewpoint of
the teacher and possibly discover information it would not have
noticed from its own viewpoint. This is similar to preference-based
planning systems that learn a user’s preferences for plan
generation (Li, Kambhampati, and Yoon 2009). Like our own work,
these systems involve inferring information about the reasoning of
a human. However, they differ in that they involve observing a
teacher demonstrate a specific task and learning from those
demonstrations.

6  Conclusions 

In this paper we have presented an inverse trust measure that a
robot can use to estimate an operator’s trust in its behavior and
to adapt that behavior to increase the operator’s trust. The robot
also learns from previous behavior adaptations using case-based
reasoning. Each time it successfully finds a trustworthy behavior,
it records that behavior as well as the untrustworthy behaviors
that it evaluated.

We evaluated our trust-guided behavior adaptation algorithm in a
simulated robotics environment by comparing it to a behavior
adaptation algorithm that does not learn from previous adaptations.
Both approaches converge to trustworthy behaviors for each type of
operator (speed-focused, safety-focused, and balanced) but the
case-based algorithm requires significantly fewer behaviors to be
evaluated before a trustworthy behavior is found. This is
advantageous because the chances that the operator will stop using
the robot increase the longer the robot is behaving in an
untrustworthy manner.

Although we have shown the benefits of trust-guided behavior
adaptation, several areas of future work exist. We have only
evaluated the behavior in a simple movement scenario but will soon
test it on increasingly complex tasks where the robot has more
behavior components that it can modify (e.g., scouting for
hazardous devices in an urban environment). In longer scenarios it
may be important to not only consider undertrust, as we have done
in this work, but also overtrust. In situations of overtrust, the
operator may trust the robot too much and allow the robot to behave
autonomously even when it is performing poorly. We also plan to
include other trust factors in the inverse trust estimate and add
mechanisms that promote transparency between the robot and
operator. More generally, adding an ability for the robot to reason
about its own goals and the goals of the operator would allow the
robot to verify it is trying to achieve the same goals as the
operator and identify any unexpected goal changes (e.g., when a
threat occurs).